
    vsgoftest: An R Package for Goodness-of-Fit Testing Based on Kullback-Leibler Divergence

    The R package vsgoftest performs goodness-of-fit (GOF) tests of various classical families of distributions, based on the Shannon entropy and Kullback-Leibler divergence estimators developed by Vasicek (1976) and Song (2002). The so-called Vasicek-Song (VS) tests apply to continuous data, typically drawn from a distribution with a density, and can even handle ties. Their excellent properties, notably high power in a wide variety of situations, make them relevant alternatives to classical GOF tests in any domain of application requiring statistical processing. The theoretical framework of VS tests is summarized, followed by a detailed description of the package's features. The power and computational-time performance of VS tests is studied through comparison with other GOF tests. Applications to real datasets illustrate the easy-to-use functionality of the vsgoftest package.
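Concretely, a VS test couples Vasicek's spacing estimator of Shannon entropy with an empirical log-likelihood term. The following is a minimal Python sketch of that idea for a normal null, not the package's R implementation; function names are illustrative:

```python
import numpy as np

def vasicek_entropy(x, m):
    """Vasicek (1976) spacing estimator of Shannon entropy."""
    n = len(x)
    xs = np.sort(x)
    idx = np.arange(n)
    # Boundary convention: x_(i) = x_(1) for i < 1 and x_(n) for i > n
    spacings = xs[np.clip(idx + m, 0, n - 1)] - xs[np.clip(idx - m, 0, n - 1)]
    return np.mean(np.log(n / (2 * m) * spacings))

def vs_statistic_normal(x, m):
    """Vasicek-Song statistic for a normal null: an estimate of the
    Kullback-Leibler divergence D = -H_hat(sample) - mean(log f0(x)),
    with f0 the normal density fitted by maximum likelihood."""
    mu, sigma = np.mean(x), np.std(x)
    log_f0 = -0.5 * np.log(2 * np.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)
    return -vasicek_entropy(x, m) - np.mean(log_f0)

rng = np.random.default_rng(0)
x = rng.normal(size=200)
print(vs_statistic_normal(x, m=5))  # small when the normal null holds
```

The sketch omits the window-size selection and p-value calibration that a complete test procedure requires.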

    äž€èˆŹé†«ć­žïŒšé†«ç™‚è™•çœź

    This paper aims at unifying, as a single goodness-of-fit procedure, the tests based on Shannon entropy (S-tests) introduced by Vasicek in 1976 and the tests based on relative entropy, or Kullback-Leibler divergence (KL-tests), introduced by Song in 2002. While Vasicek's procedure is widely used in the literature, Song's has remained more confidential. Both tests are known to have good power properties and to lead to straightforward computations. However, some asymptotic properties of the S-tests had never been checked, and the link between the two procedures had never been highlighted. Mathematical justification of both tests is detailed here, showing their equivalence for testing any parametric composite null hypothesis of maximum-entropy distributions. For testing any other distribution, the KL-tests remain reliable goodness-of-fit tests, whereas the S-tests become tests of entropy level. Moreover, for a simple null hypothesis, only the KL-tests can be considered. The methodology is applied to a real dataset from a DNA replication process, arising from a collaboration with biologists. The objective is to validate an experimental protocol for detecting chicken cell lines in which the spatiotemporal program of DNA replication is not correctly executed. We propose a two-step approach through entropy-based tests: first, a Fisher distribution with non-integer parameters is retained as reference, and then the experimental protocol is validated.
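The equivalence claimed for maximum-entropy null families can be seen numerically: when the moment constraints are matched by maximum-likelihood plug-in, the empirical cross-entropy term of the KL statistic equals the Shannon entropy of the fitted null, so the KL-test reduces to an entropy comparison (an S-test). A Python sketch for the normal family, the maximum-entropy family under a variance constraint:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(500)
mu, sigma = x.mean(), x.std()  # maximum-likelihood plug-in estimates

# Empirical cross-entropy -mean(log f0(x)) under the fitted normal f0:
# since mean((x - mu)^2) = sigma^2 exactly, the quadratic term averages to 1/2
cross_entropy = 0.5 * np.log(2 * np.pi * sigma**2) + np.mean((x - mu) ** 2) / (2 * sigma**2)
# Shannon entropy of the fitted normal distribution
normal_entropy = 0.5 * np.log(2 * np.pi * np.e * sigma**2)

print(cross_entropy, normal_entropy)  # equal up to floating point
```

With the constraints matched, the KL statistic -H_hat - mean(log f0) thus equals H(f0) - H_hat, a pure entropy difference.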


    Information-Based Parametrization of Log-Linear Models for Categorical Data Analysis

    Zighera (Appl. Stoch. Models Data Anal. 1:93–108, 1985) introduced a new parameterization of log-linear models for analyzing categorical data, directly linked to a thorough analysis of discrimination information through Kullback-Leibler divergence. The method aims at quantifying, in terms of information, the variations of a binary variable of interest by comparing two contingency tables, or sub-tables, through the effects of explanatory categorical variables. The present paper settles the mathematical background necessary to rigorously apply Zighera's parameterization to any categorical data. In particular, identifiability and good properties of the asymptotically χ²-distributed test statistics are proven to hold. Determination of the parameters and all tests of effects due to explanatory variables are simultaneous. Application to classical datasets illustrates the contribution with respect to existing methods.
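The discrimination information at the heart of this analysis is the Kullback-Leibler divergence between the distributions underlying two contingency tables. A minimal Python sketch of that quantity, with made-up counts and without Zighera's full parameterization:

```python
import numpy as np

def kl_divergence(table_p, table_q):
    """Discrimination information between two contingency tables,
    computed on their normalized cell proportions.
    Assumes all cells are non-zero."""
    p = table_p / table_p.sum()
    q = table_q / table_q.sum()
    return np.sum(p * np.log(p / q))

# Hypothetical 2x2 counts: binary outcome by group, in two samples
p = np.array([[30.0, 70.0], [45.0, 55.0]])
q = np.array([[40.0, 60.0], [40.0, 60.0]])
print(kl_divergence(p, q))  # non-negative; 0 iff the tables have equal proportions
```

Rescaled by twice the sample size, such divergences yield the asymptotically χ²-distributed statistics mentioned in the abstract.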

    Sur la recherche de φ-entropie Ă  maximisante donnĂ©e (On finding a φ-entropy with a given maximizer)

    In this paper, we are interested in maximum-entropy problems under moment constraints. Contrary to the usual problem of finding the maximizer of a given entropy, or of selecting constraints such that a given distribution is a maximizer, we focus here on determining an entropy such that a given distribution is its maximizer. The goal is, in some sense, to adapt the entropy to its maximizer, with potential application in entropy-based goodness-of-fit tests. This allows us to consider distributions outside the exponential family, to which the maximizers of the Shannon entropy belong, and also to consider simple moment constraints, estimated in practice from the observed sample. Finally, this approach also yields entropic functionals that are functions of both the probability density and the state, allowing us to include skew-symmetric or multimodal distributions in the setting.
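For context, the Shannon benchmark the abstract contrasts with can be written out: maximizing Shannon entropy under moment constraints forces the maximizer into the exponential family.

```latex
% Maximize the Shannon entropy under k moment constraints:
\[
  \max_{p}\; H(p) = -\int p(x)\,\log p(x)\,dx
  \quad \text{s.t.} \quad \int p(x)\,T_j(x)\,dx = t_j,\; j=1,\dots,k,
  \quad \int p(x)\,dx = 1.
\]
% Lagrange multipliers yield an exponential-family maximizer:
\[
  p^{*}(x) = \exp\Big(\lambda_0 + \sum_{j=1}^{k} \lambda_j\,T_j(x)\Big).
\]
```

Hence skew-symmetric or multimodal target densities cannot arise as Shannon maximizers under such constraints, which motivates searching for a φ-entropy adapted to the prescribed maximizer.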
